Full expansion of context-dependent networks in large vocabulary speech recognition

نویسندگان

Mehryar Mohri

Michael Riley

Donald Hindle

Andrej Ljolje

Fernando Pereira

چکیده

We combine our earlier approach to context-dependent network representation with our algorithm for determinizing weighted networks to build optimized networks for large-vocabulary speech recognition combining an n-gram language model, a pronunciation dictionary and context-dependency modeling. While fullyexpanded networks have been used before in restrictive settings (medium vocabulary or no cross-word contexts), we demonstrate that our network determinization method makes it practical to use fully-expanded networks also in large-vocabulary recognition with full cross-word context modeling. For the DARPA North American Business News task (NAB), we give network sizes and recognition speeds and accuracies using bigram and trigram grammars with vocabulary sizes ranging from 10,000 to 160,000 words. With our construction, the fully-expanded NAB context-dependent networks contain only about twice as many arcs as the corresponding language models. Interestingly, we also find that, with these networks, real-time word accuracy is improved by increasing vocabulary size and n-gram order.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB) as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIBchr('39')s archive is one of the richest archives in Iran containing a huge amount of multimedia data. Monitoring this massive volume of data, and brows and retrieval of this archive is one of the key issues for this broadcasting...

متن کامل

Pre-initialized composition for large-vocabulary speech recognition

This paper describes a modified composition algorithm that is used for combining two finite-state transducers, representing the context-dependent lexicon and the language model respectively, in large vocabulary speech recogntion. This algorithm is a hybrid between the static and dynamic expansion of the resultant transducer, which maps from context-dependent phones to words and is searched duri...

متن کامل

Large vocabulary Speech Recognition System: SPOJUS++

In this paper, we describe Large vocabulary Continuous Speech Recognition (LVCSR) system SPOJUS++ which has been developed in our laboratory for over 20 years and recently fully reimplemented from scratch. SPOJUS++ employs a context-dependent Hidden Markov Model (HMM) as an acoustic model and an N-gram model as a language model to decode speech. SPOJUS++ has many novel features including a dyna...

متن کامل

Large Vocabulary Search Space Reduction Employing Directed Acyclic Word Graphs and Phonological Rules

Some applications of speech recognition, such as automatic directory information services, require very large vocabularies. In this paper, we focus on the task of recognizing surnames in an Interactive telephonebased Directory Assistance Services (IDAS) system, which supersedes other large vocabulary applications in terms of complexity and vocabulary size. We present a method for building compa...

متن کامل

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1998

Full expansion of context-dependent networks in large vocabulary speech recognition

نویسندگان

چکیده

منابع مشابه

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Pre-initialized composition for large-vocabulary speech recognition

Large vocabulary Speech Recognition System: SPOJUS++

Large Vocabulary Search Space Reduction Employing Directed Acyclic Word Graphs and Phonological Rules

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

عنوان ژورنال:

اشتراک گذاری